{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "BN3AqkFOzaSM"
   },
   "source": [
    "# OmniSafe Tutorial - CLI Command\n",
    "\n",
    "OmniSafe: https://github.com/PKU-Alignment/omnisafe\n",
    "\n",
    "Documentation: https://omnisafe.readthedocs.io/en/latest/\n",
    "\n",
    "Safety-Gymnasium: https://www.safety-gymnasium.com/\n",
    "\n",
    "[Safety-Gymnasium](https://www.safety-gymnasium.com/) is a highly scalable and customizable Safe Reinforcement Learning library, aiming to deliver a good view of benchmarking Safe Reinforcement Learning (Safe RL) algorithms and a more standardized setting of environments. \n",
    "\n",
    "## Introduction\n",
    "\n",
    "In this section, you will be introduced to the CLI tool of OmniSafe, which allows you to completely avoid dealing with code. You can call OmniSafe through command-line commands, even if you have no knowledge of programming. With the information presented in this section, you can easily become proficient in using OmniSafe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "oDTmwX4d5jmR"
   },
   "outputs": [],
   "source": [
    "# Install from PyPI. if you already have OmniSafe installed, please ignore the code in this cell.\n",
    "!pip install omnisafe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "AbGRUmvz5nGL"
   },
   "outputs": [],
   "source": [
    "# Install OmniSafe and dependencies from source\n",
    "## if you already have OmniSafe installed, please ignore the code in this cell.\n",
    "## Clone the repo\n",
    "!git clone https://github.com/PKU-Alignment/omnisafe\n",
    "%cd omnisafe\n",
    "\n",
    "## Install OmniSafe\n",
    "!pip install -e ."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "NHf5aG3d5qZS"
   },
   "source": [
    "## CLI Usage Examples"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "WFANT3Fr6wtm"
   },
   "source": [
    "### Quick experiment\n",
    "\n",
    "The `train` command aims to provide you with a quick interface to run experiments.\n",
    "\n",
    "By using the following command, you can start training with default parameters, and conveniently modify some common parameters through the command line."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "5tvIl0fS63Cl"
   },
   "outputs": [],
   "source": [
    "!omnisafe train --algo PPO --total-steps 2048"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "QGbJe-ca88NM"
   },
   "source": [
    "If you want to further understand the usage of the `train` command, simply use `--help`。\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "A2wGFbyN9DYE"
   },
   "outputs": [],
   "source": [
    "!omnisafe train --help"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "YKukyW-g9dWm"
   },
   "source": [
    "For uncommon parameters, we also support modifications. However, this goes beyond the original intention of the `train` command, which was designed for quick and easy experimentation. Therefore, if there is a need to modify a large number of uncommon parameters, please refer to the `train-config` section below.\n",
    "\n",
    "In the example below, we modified the value of `steps_per_epoch` belonging to `algo_cfgs` in the default parameters via the command line.\n",
    "\n",
    "**Please note**:\n",
    "\n",
    "+ For nested parameter key-value pairs, use a colon `:` to connect them.\n",
    "+ To modify a parameter, the corresponding key-value pair must exist in the previously defined `YAML` file.\n",
    "+ To save the YAML file that contains the default parameters for each algorithm in OmniSafe, please refer to [this link](https://github.com/PKU-Alignment/omnisafe/tree/main/omnisafe/configs).\n",
    "+ To modify uncommon parameters, use `--custom-cfgs`. For this, you must provide the key-value pairs according to their respective relationships and precede each string with the `--custom-cfgs` flag. For example: `--custom-cfgs key1 --custom-cfgs value1 --custom-cfgs key2 --custom-cfgs value2`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "aWugQFB_-FOj"
   },
   "outputs": [],
   "source": [
    "!omnisafe train --algo PPOLag --env-id SafetyHumanoidVelocity-v1 --total-steps 2048 --custom-cfgs algo_cfgs:steps_per_epoch --custom-cfgs 1024"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "A0oOSrioiWbv"
   },
   "source": [
    "### Configure your experiments\n",
    "\n",
    "The `train-config` command reads the parameters you provide from the YAML file path you specify and incrementally updates them on top of the default parameters.\n",
    "\n",
    "With the `train-config` command, you can provide more detailed and specific parameters without having to deal with a huge number of parameters on the command line, which can be prone to errors.\n",
    "\n",
    "**Please note**:\n",
    "\n",
    "+ You need to create your own parameter file, which must follow the same format as the default parameters. For algorithms supported in the library, you can refer to the file format [here](https://github.com/PKU-Alignment/omnisafe/tree/main/omnisafe/configs).\n",
    "+ The configuration file can be created in any location, just provide its path when calling the `train-config` command on the command line. We recommend that you define it in the current working directory because any files generated during the execution of your command will also be saved here.\n",
    "Here is an example where we will run the previous experiment again in a different way.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "finO392PkJqi"
   },
   "outputs": [],
   "source": [
    "!mkdir ./save_config\n",
    "%cd ./save_config"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "fhmXec87l9lt"
   },
   "source": [
    "Next, define your experiment parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "D05bUldkmCit"
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "\n",
    "echo \"algo:\n",
    "  PPOLag\n",
    "env_id:\n",
    "  SafetyAntVelocity-v1\n",
    "train_cfgs:\n",
    "  total_steps:\n",
    "    1024\n",
    "  vector_env_nums: 1\n",
    "algo_cfgs:\n",
    "  steps_per_epoch:\n",
    "    1024\" > config.yaml"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "I3I-D-zDpZsT"
   },
   "source": [
    "Next, simply specify the directory of the configuration file for the `train-config` command."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "INH3AIZZpYzl"
   },
   "outputs": [],
   "source": [
    "!omnisafe train-config ./config.yaml"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "sQmvE0ZqqYGM"
   },
   "source": [
    "Similarly, you can use the `--help` option to view more related information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "RGH4stQJqh3r"
   },
   "outputs": [],
   "source": [
    "!omnisafe train-config --help"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "BANIxfB7qrE1"
   },
   "source": [
    "### Render and evaluate your experiments\n",
    "\n",
    "\n",
    "In the CLI, we also provide support for visualizing training data through the `eval` command. The `eval` command supports flexible invocation with multiple parameters, and you can get detailed explanations by using the `--help` option."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "33FepsLErVmA"
   },
   "outputs": [],
   "source": [
    "!omnisafe eval --help"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "VK1CFzbWr6Vl"
   },
   "source": [
    "\n",
    "\n",
    "Next, we will perform rendering and evaluation on the previous experiment run, and the visual results will be saved in the same directory as the experiment data.\n",
    "\n",
    "**Please note**:\n",
    "\n",
    "+ The `--no-render` option can be used to turn off visualization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "jMJZZH43sgeb"
   },
   "outputs": [],
   "source": [
    "%ls\n",
    "%cd ./train_dict\n",
    "%ls"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "-Y9NJgUbt9rh"
   },
   "source": [
    "**Please note**: \n",
    "\n",
    "+ Rendering may encounter problems on servers without a display. Please try the following steps. If you encounter any problems on your own machine, please create an [issue](https://github.com/PKU-Alignment/omnisafe/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc) on GitHub for OmniSafe to let us help you resolve the issue."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "oDIbJfEJuHpJ"
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "apt-get install libosmesa6-dev\n",
    "apt-get install python3-opengl"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "nPpSlZhb7oWw",
    "outputId": "7b7c077f-c32d-4963-dfa7-297bb6f90bd8"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "env: MUJOCO_GL=osmesa\n",
      "env: PYOPENGL_PLATFORM=osmesa\n"
     ]
    }
   ],
   "source": [
    "%env MUJOCO_GL=osmesa\n",
    "%env PYOPENGL_PLATFORM=osmesa"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "x4e9_eestRj4"
   },
   "outputs": [],
   "source": [
    "!omnisafe eval ./PPOLag-{SafetyAntVelocity-v1} --num-episode 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "aQeHlYP18NFN"
   },
   "source": [
    "Let's check the generated video."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "V-2RBJqs8UvN"
   },
   "outputs": [],
   "source": [
    "import base64\n",
    "from pathlib import Path\n",
    "\n",
    "from IPython import display as ipythondisplay\n",
    "\n",
    "\n",
    "def show_videos(video_path='', prefix=''):\n",
    "    \"\"\"\n",
    "    Taken from https://github.com/eleurent/highway-env\n",
    "\n",
    "    :param video_path: (str) Path to the folder containing videos\n",
    "    :param prefix: (str) Filter the video, showing only the only starting with this prefix\n",
    "    \"\"\"\n",
    "    html = []\n",
    "    for mp4 in Path(video_path).glob(\"{}*.mp4\".format(prefix)):\n",
    "        video_b64 = base64.b64encode(mp4.read_bytes())\n",
    "        html.append(\n",
    "            '''<video alt=\"{}\" autoplay \n",
    "                    loop controls style=\"height: 400px;\">\n",
    "                    <source src=\"data:video/mp4;base64,{}\" type=\"video/mp4\" />\n",
    "                </video>'''.format(\n",
    "                mp4, video_b64.decode('ascii')\n",
    "            )\n",
    "        )\n",
    "    ipythondisplay.display(ipythondisplay.HTML(data=\"<br>\".join(html)))\n",
    "\n",
    "\n",
    "# Please fill the path of folder containing your video which is shown above here\n",
    "show_videos(video_path='')"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "bC1gRloPu-rF"
   },
   "source": [
    "### Quick benchmarking and large-scale experiment\n",
    "\n",
    "\n",
    "\n",
    "`experiment_grid` is an incredibly useful tool. With it, you can quickly test all feasible parameter combinations that influence algorithm performance. Typically, `experiment_grid` is only needed when a significant number of parameters must be specified, and as such, it can only be invoked through a configuration file. Please refer to [this link](https://github.com/PKU-Alignment/omnisafe/blob/main/examples/benchmarks/example_cli_benchmark_config.yaml) for the configuration file format.\n",
    "\n",
    "In essence, you must specify the key-value pairs of the parameters, and if nested key-value pairs are present, use a colon : to separate them. All feasible values must be enclosed in square brackets `[]`, and indentation must adhere to the YAML file format requirements.\n",
    "\n",
    "**Please note**:\n",
    "\n",
    "+ We recommend that you set the `num-pool` value to a number that evenly divides the total number of experiments, ensuring that the number of tasks executed at any given time is roughly the same during the entire experiment process.\n",
    "+ `exp_name` specifies the directory in which all of your experiments for this run will be located. To prevent file conflicts between two experiments, you must explicitly provide a value for `exp_name` each time you run `experiment_grid`.\n",
    "\n",
    "\n",
    "In the OmniSafe CLI tool, you can invoke `experiment_grid` using the `benchmark` command.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Amjgyq2gc8Av"
   },
   "source": [
    "Certainly, you can also obtain further usage instructions by using the `--help` option."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "0vTStdXIdEPO"
   },
   "outputs": [],
   "source": [
    "!omnisafe benchmark --help"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "SlponENqd_gi"
   },
   "source": [
    "Next, let's run a set of experiments using the `benchmark` command. This will make your work quite efficient when a large number of parameters need to be adjusted."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "huSt8vCSfhtY"
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "\n",
    "echo \"algo:\n",
    "  ['PolicyGradient', 'NaturalPG']\n",
    "env_id:\n",
    "  ['SafetyAntVelocity-v1']\n",
    "logger_cfgs:use_wandb:\n",
    "  [False]\n",
    "train_cfgs:vector_env_nums:\n",
    "  [2]\n",
    "train_cfgs:torch_threads:\n",
    "  [1]\n",
    "train_cfgs:total_steps:\n",
    "  4096\n",
    "algo_cfgs:steps_per_epoch:\n",
    "  2048\n",
    "seed:\n",
    "  [0]\" > grid_config.yaml"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Qw5e3XsWf2z5"
   },
   "source": [
    "To run the command, you must:\n",
    "\n",
    "1. Provide an experiment name.\n",
    "2. Provide the maximum number of processes that can be run simultaneously in this experiment.\n",
    "3. Provide the path to the configuration file containing the parameter settings.\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "9p_KydAUf1lS"
   },
   "outputs": [],
   "source": [
    "!omnisafe benchmark test_grid 2 ./grid_config.yaml"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "f34rBm9Ug4ep"
   },
   "source": [
    "### Quick analyzing the results\n",
    "\n",
    "\n",
    "\n",
    "In the `train` and `train-config` commands, you can directly specify whether to plot the training curve, visualize the model during the process, or evaluate the model's performance using the `--plot`, `--render`, and `--evaluate` options.\n",
    "\n",
    "However, in `benchmark`, due to the significantly greater complexity of the analysis compared to a single experiment, we have introduced a separate `analyze-grid` command to analyze the results.\n",
    "\n",
    "**Please note**:\n",
    "\n",
    "+ `render` and `evaluate` are indistinguishable to the model during the process, as they both involve testing the model in different environment instances. You can think of rendering as a form of evaluating that simultaneously generates a video.\n",
    "+ During testing, the condition `deterministic==True` applies to the network.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "imSYTMpKiyQP"
   },
   "source": [
    "\n",
    "Next, we will introduce the `analyze-grid` command. By using the `--help` option, you can see the following parameters that can be specified:\n",
    "\n",
    "+ `path`: indicates the path where a set of `experiment_grid` experiments are saved. As long as there is a complete experiment folder started by `experiment_grid` under the given directory, the program will automatically recursively search for its specific location.\n",
    "\n",
    "+ `parameter`: specifies the name of the parameter whose effect you want to analyze through performance comparison. If there are nested parameters, please use : to separate them.\n",
    "\n",
    "+ `compare_num`: indicates the maximum number of different parameter values you want to see on the same graph when analyzing the effect of the parameter. OmniSafe will automatically generate all such combinations. For example, if there are three algorithms and only one value for each of the other parameters, and `parameter='algo' compare_num=2`, you will get three graphs, each comparing the performance of two algorithms under the current environment and parameters. This can greatly reduce your mental burden during comparison.\n",
    "\n",
    "+ `cost_limit`: the black line displayed in the cost part of the performance graph, indicating the value of the `cost_limit` that should be followed.\n",
    "\n",
    "You may have noticed that in the previous section, users can also specify the `values` option to select specific parameter values for comparison. Due to the limitations of command line operations, we do not support this option here. If you need to use it, you can create a Python script to call the statistics tools to implement it."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "99wH31s-lxg2"
   },
   "source": [
    "Next, we will analyze the experiment that was run above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "kJG87jyMl3Rs"
   },
   "outputs": [],
   "source": [
    "!omnisafe analyze-grid --help"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "qy3iI_TT9jAP"
   },
   "source": [
    "Regarding the specific usage of the data analysis tools, a more detailed introduction was given in the first section. Here, we only show how to call them through CLI.\n",
    "\n",
    "To generate performance comparison graphs for PPO and PolicyGradient in different environments in the above experiment grid, you can use the following CLI command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "VonxbBLy9iXT"
   },
   "outputs": [],
   "source": [
    "!omnisafe analyze-grid ./benchmark/test_grid algo --compare-num 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "hjVsp-UXmBrG"
   },
   "source": [
    "### Explore\n",
    "OmniSafe is a young and constantly growing library, and we will continue to improve its functionality and optimize the user experience. This section aims to cover the core usage of OmniSafe's CLI tools. You can look forward to new interactive ways and features in the future. If you have any better insights during use, please feel free to contact us. We will carefully listen to your suggestions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "fY4zSf60m6Bi"
   },
   "source": [
    "By using `--help`, you can continue exploring OmniSafe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "YwfeK-DOm-6J"
   },
   "outputs": [],
   "source": [
    "!omnisafe --help"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "name": "python3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
